Identify Student Technical Details: Provincial Approach to Student Information (PASI) API

This section provides the technical details of how the identify student service works. This will help explain why certain results are returned.

Primary and Secondary Students

If the search finds a student that matches the search criteria, that student is returned independently of any other result. In general, if a Secondary Alberta Student Number was found but its Primary Alberta Student Number was not, the Primary is added to the results. In that case the Primary can only be returned if the Secondary is also returned and not filtered out. For the most part, the Primary will have the same Quality and Rank as the best matching Secondary that was found. It is possible for the Primary to be filtered out while the Secondary is returned, but not the other way around. However, if the Primary itself matched the search criteria, it is returned independently of any Secondaries that may have been found.
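
As a rough illustration of the rule above, the sketch below adds a missing Primary to a result list using the Quality and Rank of its best matching Secondary. The Match type, the primary_of lookup, and the function name are hypothetical and are not part of the API; treating "best" as quality first and rank second is also an assumption.

```python
from dataclasses import dataclass

QUALITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}  # lower value = better quality


@dataclass(frozen=True)
class Match:
    asn: str      # Alberta Student Number (hypothetical field name)
    quality: str  # "High", "Medium" or "Low"
    rank: float   # 0.0 to 1.0


def add_missing_primaries(matches, primary_of):
    """Append Primaries that were not found themselves.

    primary_of maps a Secondary ASN to its Primary ASN (hypothetical lookup).
    """
    found = {m.asn for m in matches}
    best_secondary = {}
    for m in matches:
        primary = primary_of.get(m.asn)
        if primary is None or primary in found:
            continue  # not a Secondary, or the Primary matched on its own
        current = best_secondary.get(primary)
        key = (QUALITY_ORDER[m.quality], -m.rank)
        if current is None or key < (QUALITY_ORDER[current.quality], -current.rank):
            best_secondary[primary] = m
    # The added Primary copies the Quality and Rank of its best Secondary.
    added = [Match(p, s.quality, s.rank) for p, s in best_secondary.items()]
    return list(matches) + added
```

Because the Primary is derived from a returned Secondary, a later filtering step could still remove the Primary while keeping the Secondary, which is consistent with the behaviour described above.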

Search Types

Depending on the search request, one of the following search types will be performed:

  • Birth Date
    If the "BirthDate" property is provided, the system will search for the student whose birth date "matches" the birth date provided in addition to matching name criteria.

    This type of search may return matches of any quality.
  • Approximate Age
    If the "ApproximateAge" property is provided (has a value greater than 0 and the "BirthDate" property is not provided), the system will perform an approximate age search. The system will search for students whose age is close to the approximate age provided in addition to matching name criteria.

    If the approximate age is between 0 and 5, the system will search for students with birth dates that occurred in the last 6 years. For example, if it is currently 2009, the system would search for students whose birth date falls between 2003 and 2009 inclusive.

    If the approximate age is more than 5, the system will determine the year the student was born in (the assumption is that the student's approximate age is the student's actual age) and use a year range that is +/- one year. For example if it is currently 2009 and the approximate age is 6, then the student was born in 2003. The system would search for students whose birth date falls between 2002 and 2004 inclusive.

    This type of search may only return matches of low quality.
  • Grade
    If the Grade property is provided (and the "BirthDate" and "ApproximateAge" properties aren't) the system will search for students whose grade is close to the grade provided in addition to the matching name criteria.

    To do this, the grade is mapped to an approximate age using the table below:

    Grade    Age
    EC       0 to 5
    1        6
    2        7
    3        8
    4        ...

Once the approximate age is determined, the search is performed the same way an approximate age search is performed. For example, using the grade "2" (and not providing approximate age) in a search request is equivalent to using the approximate age "7."

This type of search may only return matches of low quality.
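
The search type selection and the year ranges described above can be sketched as follows. This is a minimal illustration, not the service's implementation: the function names are hypothetical, only the grades shown in the table are mapped (the rest of the table is elided in this document), and "EC" is mapped to age 5 purely because every age from 0 to 5 yields the same range.

```python
# Only the grades listed in the table; remaining rows are elided in the source.
GRADE_TO_AGE = {"EC": 5, "1": 6, "2": 7, "3": 8}


def search_type(birth_date=None, approximate_age=0, grade=None):
    """Pick the search type using the precedence described in the text."""
    if birth_date is not None:
        return "BirthDate"
    if approximate_age and approximate_age > 0:
        return "ApproximateAge"
    if grade is not None:
        return "Grade"
    return None  # behaviour with none of the three is not described here


def birth_year_range(approximate_age, current_year):
    """Return the inclusive (first_year, last_year) range to search."""
    if approximate_age <= 5:
        # Ages 0 to 5: birth dates in the last 6 years.
        return (current_year - 6, current_year)
    birth_year = current_year - approximate_age
    return (birth_year - 1, birth_year + 1)  # +/- one year


def grade_year_range(grade, current_year):
    """A grade search maps the grade to an age, then behaves like an age search."""
    return birth_year_range(GRADE_TO_AGE[grade], current_year)
```

With a current year of 2009, birth_year_range(6, 2009) gives (2002, 2004), matching the example above, and grade_year_range("2", 2009) gives the same result as birth_year_range(7, 2009).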

Match Types

Depending on the search type (as described above) the system may return three different types of matches:

  • High Quality
  • Medium Quality
  • Low Quality

Each match type indicates its relative importance in the search results. For example, high quality matches are more important than medium quality matches because the results found more closely match the criteria provided.

Match Status

The search results include a status. The status may be one of the following:

  • No Match
    This status indicates no students were found; the results do not include any students.
  • Match
    This status indicates exactly one student was found; the results contain one student.
  • Ambiguous
    This status indicates more than one student was found; the results contain multiple students.
  • FoundStudentWithDisclosureRestriction
    This status indicates the search found students that have a disclosure restriction; whether those students are returned in the results is determined by the IncludeStudentWithDisclosureRestriction property. If the property is set to true, the results will contain at least one student that has an active disclosure restriction; otherwise the results will not contain any such student. If the search found one or more students that have a disclosure restriction, this status is returned instead of Match or Ambiguous.
  • Insufficient
    This status indicates too many matches were found and the matches were essentially indistinguishable from each other in terms of importance. In this case no students are returned.

Include All Quality Levels

If the IncludeAllQualityLevels property on the request is set to false, the search results will only include the matches from the best quality matcher that returns something for a given search type. Additionally, if the property is set to false, only students with a Primary Alberta Student Number will be returned. If the property is set to true, the search results will include the matches from all matchers for a given search type, including students with an Alberta Student Number that is Deactivated or Secondary.

For example, a Birth Date search uses all three matchers: high, medium, and low. High is considered the best quality matcher for this type of search. If IncludeAllQualityLevels is set to false, the search results will include high quality matches. If and only if there are no high quality matches, the search results will include medium quality matches. If and only if there are no medium quality matches either, the search results will include low quality matches. If IncludeAllQualityLevels is set to true, the search results may include high, medium, and low quality matches.

Not all search types use all the matchers. For example, an approximate age search only uses the low quality matcher. In this case "low" is the best (and only) quality matcher for this type of search. Therefore, the IncludeAllQualityLevels property is not relevant when an approximate age search is performed.
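
The effect of IncludeAllQualityLevels can be sketched as below. The function name and the dictionary shapes (including the asn_type key) are hypothetical. Applying the Primary-only restriction after the best matcher has been chosen is an assumption; the text above does not say in which order the two rules are applied.

```python
QUALITY_ORDER = ["High", "Medium", "Low"]  # best matcher first


def select_matches(by_quality, include_all_quality_levels):
    """by_quality maps a quality level to the students its matcher returned.

    Only matchers used by the search type need to be present, for example
    just "Low" for an approximate age search.
    """
    if include_all_quality_levels:
        # Every matcher's results, including Secondary or Deactivated numbers.
        return {q: list(by_quality[q]) for q in QUALITY_ORDER if by_quality.get(q)}
    for quality in QUALITY_ORDER:
        students = by_quality.get(quality, [])
        if students:
            # Best matcher that returned something; keep Primary numbers only.
            primaries = [s for s in students if s["asn_type"] == "Primary"]
            return {quality: primaries}
    return {}
```

For a search type that only uses the low quality matcher, both settings select the same matcher, which is why the property has no effect on which quality levels appear for an approximate age search.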

Include Student With Disclosure Restriction

The IncludeStudentWithDisclosureRestriction property determines if the search results may include students that have an active disclosure restriction. If the property is set to true, the results may include these students. If the property is set to false, the results will not contain any student that has a disclosure restriction, even if the student matches the search criteria. In either case, if students that have a disclosure restriction are found, the status returned will be FoundStudentWithDisclosureRestriction. Please see the business context documentation for when it is appropriate to set this property to true.
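
The interaction between the match status and this property can be sketched as follows. The function name and the student dictionaries are hypothetical, and the Insufficient status is deliberately left out because it depends on ranking and filtering rather than on disclosure restrictions.

```python
def match_status(found, include_restricted):
    """found: list of dicts with 'id' and 'restricted' keys (hypothetical shape).

    Returns (status, results).
    """
    any_restricted = any(s["restricted"] for s in found)
    # Restricted students are only returned when the caller asks for them.
    results = [s for s in found if include_restricted or not s["restricted"]]
    if any_restricted:
        # Reported instead of Match or Ambiguous, whatever is returned.
        return "FoundStudentWithDisclosureRestriction", results
    if not results:
        return "No Match", results
    if len(results) == 1:
        return "Match", results
    return "Ambiguous", results
```

Note that with the property set to false the status can be FoundStudentWithDisclosureRestriction while the result list is empty.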

Matcher Indices

The PASI core maintains a number of "Matcher Indices" in order to search for students efficiently. The matchers use the following indices related to name criteria:

  • Name Exact
    Used to find students whose name(s) match the name criteria exactly. For example, searching for "Bob" will find students with the name "Bob".
  • Name Starts With
    Used when the name criteria contain a single character, to find students whose name(s) start with that character. For example, searching for the name "B" will find students whose name(s) start with "B" (e.g. "Bob").
  • Name Initial
    Used when the name criteria contain more than one character, to find students whose name(s) consist of only the first character of the criteria. For example, searching for the name "Bob" will find students who have a name recorded as the initial "B".
  • Name Phonetic
    Used to find students whose name(s) "sound like" the name criteria. For example, searching for "Jaymes" will find students whose name(s) is "James" because "James" is phonetically equivalent to "Jaymes".
  • Name Alternate
    Used to find students whose name(s) match a list of alternate names for the name criteria. For example, if "Roberto" and "Raibeart" are alternate names for "Bob", searching for "Bob" will find students whose name(s) are "Roberto" or "Raibeart".

The matchers use the following indices related to birth date criteria:

  • Birth Date Exact
    Used to find students whose birth date matches the birth date criteria exactly. For example, searching for "March 7, 2003" will find students whose birth date is "March 7, 2003."
  • Birth Date Year
    Used to find students whose year of birth matches the year of the birth date criteria. For example, searching for "March 7, 2003" will find students that were born in 2003.
  • Birth Date Two Components
    Used to find students whose birth date matches two out of three components of the birth date criteria. For example, searching for "March 7, 2003" will find students whose birth date is any day in March 2003 (e.g. March 17, 2003), any year on March 7th (e.g. March 7, 1999), or the 7th day in any month in 2003 (e.g. June 7, 2003).
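
The three birth date comparisons listed above can be expressed as simple predicates. This is only a sketch of the matching semantics, not of how the indices are stored; the function names are hypothetical, and treating an exact match as also satisfying "two components" is an assumption.

```python
from datetime import date


def birth_date_exact(criteria, candidate):
    """Birth Date Exact: year, month and day all match."""
    return criteria == candidate


def birth_date_year(criteria, candidate):
    """Birth Date Year: only the year has to match."""
    return criteria.year == candidate.year


def birth_date_two_components(criteria, candidate):
    """Birth Date Two Components: at least two of year, month, day match."""
    matching = sum([
        criteria.year == candidate.year,
        criteria.month == candidate.month,
        criteria.day == candidate.day,
    ])
    return matching >= 2
```

Using the criteria date(2003, 3, 7), the candidates date(2003, 3, 17), date(1999, 3, 7) and date(2003, 6, 7) all satisfy birth_date_two_components, while date(2003, 6, 8) only satisfies birth_date_year.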

High Quality Matcher

A high quality match is generally indicative of a successful search; that is the matcher found a student (or possibly students) that are likely to be the student(s) being sought. The high quality matcher uses the following indices:

  • Name Exact (for given and last names)
  • Birth Date Exact

Example criteria:

First Name: Jane

Middle Name: B

Last Name: Doe

Birth date: Feb 17, 2000

The above will find students where:

There exists a name ("Jane")

And there exists an initial ("B")

And there exists a name "Doe"

And the birth date is Feb 17, 2000

Medium Quality Matcher

A medium quality match is broader than a high quality match. This type of match provides a lower level of confidence that the student being sought is in the list of students found. This generally indicates the search criteria provided may not exactly match what is found in the PASI core.

The medium quality matcher uses the following indices:

  • Name Exact (for given and last names)
  • Name Initial (for given names)
  • Name Starts With (for given names)
  • Name Alternate (for given names)
  • Birth Date Exact

If only two name parts are provided in the search criteria, the search is performed without name combinations.

Example criteria:

First Name: Bob

Last Name: Doe

Birth Date: Feb 17, 2000

If "Roberto" is an alternate name for "Bob" then the above will find students where:

There exists a name ("Bob" or "B" or "Roberto")

And there exists a name "Doe"

And the birth date is Feb 17, 2000

If three or more name parts are provided in the search criteria, the search is performed with name combinations. Each combination of two names in the search criteria is searched on.

Example criteria:

First Name: Mary-Lou

Last Name: Smith

Birth Date: Feb 17, 2000

If "Mayme" is an alternate name for "Mary" then the above will find students whose birth date is Feb 17, 2000 and for whom at least one of the following conditions is met:

At least one name is ("Mary" or "Mayme" or "M") And at least one name is ("Lou" or "L")

At least one name is ("Mary" or "Mayme" or "M") And at least one name is "Smith"

At least one name is ("Lou" or "L") And at least one name is "Smith"
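
The name side of the medium quality matcher, including the name combinations, can be sketched as below. Everything here is illustrative: the ALTERNATES table and function names are hypothetical, splitting "Mary-Lou" into "Mary" and "Lou" on the hyphen is inferred from the example above, and the exact birth date check that the matcher also applies is omitted.

```python
from itertools import combinations

# Hypothetical alternate-name table, taken from the examples in this section.
ALTERNATES = {"Bob": {"Roberto"}, "Mary": {"Mayme"}}


def given_name_matches(criteria, student_name):
    """Name Exact, Name Initial, Name Starts With and Name Alternate."""
    if len(criteria) == 1:
        return student_name.startswith(criteria)  # Name Starts With
    accepted = {criteria, criteria[0]} | ALTERNATES.get(criteria, set())
    return student_name in accepted


def part_matches(part, student_names):
    kind, value = part
    if kind == "last":
        return value in student_names  # last names use Name Exact only
    return any(given_name_matches(value, n) for n in student_names)


def medium_name_match(first_name, last_name, student_names):
    """True if the student's names satisfy the medium quality name criteria."""
    parts = [("given", p) for p in first_name.replace("-", " ").split()]
    parts.append(("last", last_name))
    if len(parts) <= 2:
        # Two name parts: both must match, no combinations.
        return all(part_matches(p, student_names) for p in parts)
    # Three or more parts: any combination of two parts is enough.
    return any(
        part_matches(a, student_names) and part_matches(b, student_names)
        for a, b in combinations(parts, 2)
    )
```

For the criteria "Mary-Lou" and "Smith", a student recorded with the names "Mayme" and "Smith" satisfies the second combination listed above, while a student recorded as "Mary" and "Jones" satisfies none of them.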

Low Quality Matcher

A low quality match is very broad compared to a medium quality match. This type of match includes students that only very loosely match the search criteria provided. This quality level generally signals there is a data quality issue. Some of the information in the search criteria does not match what is in the PASI core and it should be corrected. The low quality matcher uses the following indices:

  • Name Exact (for given and last names)
  • Name Initial (for given names)
  • Name Starts With (for given names)
  • Name Alternate (for given names)
  • Name Phonetic (for given and last names)
  • If the Birth Date is provided...
    Birth Date Exact, Birth Date Year (+/- 1 year), Birth Date Two Components
    Or
    If the Approximate Age or Grade is provided...
    Birth Date Year (+/- as determined by the approximate age or grade)

Example criteria (with birth date provided):

First Name: James

Last Name: Smith

Birth Date: Feb 17, 2000

If "Jaymes" is phonetically equivalent to "James" then the above will find students where:

There exists a name ("James" or "J" or "Jaymes")

And there exists a name "Smith"

In addition to the above, the student's birth date must meet at least one of the following conditions:

The birth date is Feb 17, 2000

Or two of the birth date components match such as...

  • Feb <Any Day>, 2000 for example Feb 10, 2000
  • <Any Month> 17, 2000 for example Dec 17, 2000
  • Feb 17, <Any Year> for example Feb 17, 1987

Or the year of the birth date is between 1999 and 2001 inclusive

Example criteria (with approximate age provided):

First Name: James

Last Name: Smith

Approximate Age: 8

If the current year is 2009 and we assume the student's actual age is 8, then the student was born in 2001. The "birth date" criteria used in the search is the year range 2000 to 2002 inclusive. The above search will find students where:

There exists a name ("James" or "J" or "Jaymes")

And there exists a name "Smith"

And the year of the birth date is between 2000 and 2002 inclusive

Searching by grade is identical to searching by approximate age; in that case the grade is mapped to a year range and the search is performed as described above.

State Province ID "Search"

If a State Province ID is provided in the criteria and the results of the main search don't contain it, it is added to the results. If a State Province ID is provided, the quality level of all other State Province IDs in the results is reduced to low.

Ranking Results

The search results may include three lists of matches: high quality, medium quality, and low quality. Within each quality list, each match is ranked according to how well it matched. A rank is a number between 0 and 1 inclusive. A rank of 1 is considered a "perfect" match; that is, the student "perfectly" matched the search criteria on everything that was tested for a match. Although ranking is within each quality list, in general high quality matches will have higher ranks than medium quality matches, and medium quality matches will have higher ranks than low quality matches.

The search criteria are compared against the student record found in the database to determine the rank. In general, the better the criteria matched, the higher the rank. For example, an exact match on a name is considered better than a match on just the first character of a name: if we searched for the name "John", a student record that contained the name "John" would (if no other criteria were considered) rank higher than a student record that contained a name that just started with "J", like "Joe." The same is true for birth dates; an exact match on a birth date is considered better than a partial match on a birth date, and so the former would (if no other criteria were considered) rank higher than the latter.

If one or more address properties are provided, they are used to adjust the ranks; address properties are not otherwise used for searching. For example, if the city is provided, student records found that do not contain an address in that city will (if no other criteria were considered) rank lower than student records found that contain an address in that city.

Filtering Results

After ranking is completed, the results are filtered. Secondary State Province IDs are filtered out. However, if a State Province ID is provided in the criteria and it exists in the system, it is included in the results even if it is a Secondary State Province ID. The results are then filtered based on rank to return no more than 20 matched students. In general, high quality matches are given preference over medium quality matches, and medium quality matches are given preference over low quality matches. In the unlikely scenario that there are more than 20 matched students all with the same rank and a State Province ID was not provided in the criteria (or it was provided but doesn't exist), no students will be returned and the match status will be "Insufficient."
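
The 20-result cap and the Insufficient case can be sketched as follows. The function name and tuple layout are hypothetical, and ordering strictly by quality and then by rank is an assumption (the text only says high quality is preferred "in general"). The removal of Secondary IDs is not shown, and the behaviour when the requested ID exists and all ranks are equal is not spelled out above, so the sketch simply truncates in that case.

```python
MAX_RESULTS = 20
QUALITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def filter_results(matches, requested_id_found=False):
    """matches: list of (student_id, quality, rank) tuples.

    Returns (insufficient, results). requested_id_found is True when a
    State Province ID was provided in the criteria and exists in the system.
    """
    if len(matches) > MAX_RESULTS and not requested_id_found:
        ranks = {rank for _, _, rank in matches}
        if len(ranks) == 1:
            # Too many matches and none stands out: return nothing.
            return True, []
    # Prefer better quality, then higher rank, and keep at most 20.
    ordered = sorted(matches, key=lambda m: (QUALITY_ORDER[m[1]], -m[2]))
    return False, ordered[:MAX_RESULTS]
```

For example, 21 low quality matches that all have rank 0.5 produce (True, []), whereas 25 matches with differing ranks are cut down to the 20 best.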